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ABSTRACT 

We present TreeCol, a new and efficient tree-based scheme to calculate column den- 
sities in numerical simulations. Knowing the column density in any direction at any 
location in space is a prerequisite for modeling the propagation of radiation through 
the computational domain. TreeCol therefore forms the basis for a fast, approximate 
method for modelling the attenuation of radiation within large numerical simulations. 
It constructs a HEALPix sphere at any desired location and accumulates the column 
density by walking the tree and by adding up the contributions from all tree nodes 
whose line of sight contributes to the pixel under consideration. In particular when 
combined with widely-used tree-based gravity solvers the new scheme requires little 
additional computational cost. In a simulation with N resolution elements, the com- 
putational cost of TreeCol scales as NlogN, instead of the N^^^ scaling of most other 
radiative transfer schemes. TreeCol is naturally adaptable to arbitrary density distri- 
butions and is easy to implement and to parallelize, particularly if a tree structure is 
already in place for calculating the gravitational forces. We describe our new method 
and its implementation into the SPH code Gadget 2. We discuss its accuracy and per- 
formance characteristics for the examples of a spherical protostellar core and for the 
turbulent interstellar medium. We find that the column density estimates provided by 
TreeCol are on average accurate to better than 10 percent. In another application, we 
compute the dust temperatures for solar neighborhood conditions and compare with 
the result of a full-fiedged Monte Carlo radiation-transfer calculation. We find that 
both methods give very similar answers. We conclude that Tree Co /provides a fast, easy 
to use, and sufficiently accurate method of calculating column densities that comes 
with little additional computational cost when combined with an existing tree-based 
gravity solver. 
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1 INTRODUCTION 

The penetration of radiation into an optically thick distri- 
bution of gas is a feature of many astrophysical systems, 
ranging from scales as small as those of circumstellar disks 
to those as large as the damped Lyman-a absorbers observed 
along the sight lines to many quasars. Numerical modelling 
of the propagation of radiation through the gas can greatly 
aid our efforts to understand the astrophysics of these sys- 
tems, but frequently proves to be computationally challeng- 
ing, owing to the high dimensionality of the problem. In the 
common case in which we have no useful spatial symme- 
tries to exploit and wish to solve for the properties of the 
radiation field within Ni, different frequency bins, the com- 
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putational cost of determining the full spatial and angular 
distribution of the radiation field is of order N^^^ x N^, 
where N is the number of resolution elements (e.g. grid cells 
in an Eulerian simulation, or particles in a smoothed particle 
hydrodynamics [SPH] model), and where we have assumed 
that the desired angular resolution is comparable to the spa- 
tial resolution. For static problems, where the gas distribu- 
tion is fixed and we need only to solve for the properties of 
the radiation field at a single point in time, it is currently 
possible to solve the full radiative transfer problem numer- 



ically even for relatively large values of N (see e.g. Rundle 
|et al.|[2'010[ who post-process the results of an SPH simu- 
lation with N = 3.5 X 10^). However, if one is interested 
in dynamical problems, where the gas distribution is not 
fixed and the gas and radiation significantly infiuence one 
another, then the cost of solving for the radiation field af- 
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ter every single hydro dynamical timestep can easily become 
prohibitively large (for a detailed discussion, see Klessen et 
aL][20TTt 



For this reason, it is useful to look for simpler, more 
approximate techniques for treating the radiation that have 
a much lower computational cost, and that can therefore be 
used within hydrodynamical simulations without rendering 
these simulations overly expensive. One common simplifica- 
tion that nevertheless has a reasonably broad range of ap- 
plicability is to ignore the re-emission of incident radiation 
within the gas. Making this simplification means that rather 
than solving the full transfer equation. 



ds 



—^=r)^- X.I. 



(1) 



along multiple rays through the gas, where Ii^ is the specific 
intensity at a frequency zy, r]u and Xi^ ^he emissivity and 
opacity at the same frequency, and s is the path length along 
the ray, one can instead solve the simpler equation. 



ds 



(2) 



Equation [T] has the formal solution 

lu = /i.,oe~^'' + / Sue'^^^'dri, (3) 

^0 

where I^^q is the specific intensity of the radiation field at the 
start of the ray (e.g. at the edge of a gas cloud), = Vi^/Xi^ 
is the source function, and Tu is the optical depth along the 
ray. If <^ and the optical depth is not too large, then 
it is reasonable to neglect the integral term, in which case 
we can write Ij, as 



(4) 



which is the formal solution to Equation [2] By making this 
approximation, we therefore reduce the problem to one of 
determining optical depths along a large number of rays. 
Often, this problem can then be further reduced to one of 
determining the column density of some absorber (e.g. dust) 
along each ray. 

Unfortunately, although these simplifications make the 
problem easier to handle numerically, they do not go far 
enough, as the most obvious technique for calculating the 
column densities - integrating along each ray - still has a 
computational cost that scales as N^^^ and hence is imprac- 
tical in large simulations. This motivates one to look for 
computationally cheaper methods for determining the angu- 
lar distribution of column densities seen by each resolution 
element within a large numerical simulation. 

In this paper, we introduce a computationally cheap 
and acceptably accurate method for computing these col- 
umn densities, suitable for use within simulations of self- 
gravitating gas that utilize a tree-based solver for calculat- 
ing gravitational forces. Our method, which we dub TreeCol, 
makes use of the large amount of information on the den- 
sity distribution of the gas that is already stored within the 
tree structure to accelerate the calculation of the required 
column density distributions. 

In the next section, we give a description of how our 
algorithm works, starting with a overview of how tree-based 
gravity solvers work in Section [2.1[ and then showing how 
it is possible to implement the TreeCol method in Section 
|2.2| We then present two stringent tests of the algorithm in 
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Figure 1. Schematic diagram showing how the tree is constructed 
and used for the gravitational force calculation. A 3D oct-tree 
splits each parent node into eight daughter nodes, but in this 
2D representation, we show only four of these nodes. The black 
lines show the boundaries of the tree nodes that would be con- 
structed for the given ensemble of particles, shown as blue dots. 
The regions shaded in red denote the nodes that would be used 
to calculate the gravitational force as seen by the large blue and 
orange particle at the bottom of the diagram. Note that in the 
case where the nodes being used contain only one particle (a 'leaf 
node), the position of the particle itself is used to calculate the 
gravitational force arising from that node. 



Section |3] both of which are typical of the conditions found 
in contemporary simulations of star formation. We discuss 
some of the potential applications of the TreeCol method 
in Section [4] In Section |5] we give an overview of the com- 
putational efficiency of this scheme, and we summarise this 
paper in Section |6] 



2 TREECOL 

2.1 Basic idea behind TreeCol 



Tree-based gravity solvers (e.g. Barnes Sz Hut| p^86| 1989) 
have long been a standard feature of iV-body and smoothed 
particle hydrodynamics codes (e.g. Benz|[l988^ Vine k, Sig- 



urdsson 



1998 



Springel et al. 



2001 



Wadsley et al. 2004). 



More recently, their accuracy and speed has also seen them 
adopted in grid-based codes (|Dale et al.||2009). In this pa- 
per, we describe a method whereby the information stored 
in the gravitational tree can be used to construct a An stera- 
dian map of the column density. By constructing this map 
at the same time as the tree is being "walked" to deter- 
mine the gravitational forces, we can minimize the amount 
of additional communication necessary between CPUs hold- 
ing different portions of the tree. Since the structure of the 
tree, and how it is walked, will be important for our discus- 
sion, we will first give a brief overview of how a tree-based 
gravity solver works. For the purpose of this discussion, we 
consider a solver based on an oct-tree, as used in e.g. the 



Gadget SPH code (Springel 2005), although we note that 



solvers based on other tree structures, such as binary trees, 
do exist (e.g. the binary tree employed by Benz| 198*8 which 
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Figure 2. Schematic diagram illustrating the TreeCol concept. 
During the tree walk to obtain the gravitational forces, the pro- 
jected column densities of the tree nodes (the boxes shown on the 
right) are mapped onto a spherical grid surrounding the particle 
for which the forces are being computed (the "target" particle, 
shown on the left). The tree already stores all of the information 
necessary to compute the column density of each node, the posi- 
tion of the node in the plane of the sky of the target particle, and 
the angular extent of the node. This information is used to com- 
pute the column density map at the same time that the tree is 
being walked to calculate the gravitational forces. Provided that 
the tree is already employed for the gravity calculation, the in- 
formation required to create the 47r steradian map of the column 
densities can be obtained for minimal computational cost. 



later found its way into other high profile studies, such as 
Bonnell et al.|1998| and [Bate et al.|2Q03t . 

A tree-based solver starts by constructing a tree, split- 
ting the computational volume up into a series of nested 
boxes, or 'nodes'. The 'root' node is the largest in the hi- 
erarchy and contains all of the computational points in the 
simulation. This large 'parent' node is then split up into 
eight smaller 'daughter' nodes as shown in Figure [l] The 
daughter nodes are further refined (becoming parents them- 
selves) until each tree node contains only one particle (illus- 
trated in Figure [l] by the blue dots). These smallest nodes 
at the very bottom of the hierarchy are typically termed 
'leaves'. At each point in the hierarchy, the tree stores the 
information about the contents of the parent node (includ- 
ing its position, mass and size) that will be needed during 
the gravitational force calculation. Once the construction of 
the tree is complete, each particle is located in a leaf node 
situated at the bottom of a nested hierarchy of other nodes. 

Once the tree is built, it can then be "walked" to get 
the gravitational forces. The idea behind the speed-up of- 
fered by the tree gravity solver over direct summation is 
very simple: any region of structured mass that is far away 
can be well approximated as a single, unstructured object, 
since the distances to each point in the structure are essen- 
tially the same. Strictly, this is only true if the angular size 
of the structure is small, and so tree-codes tend to adopt 
an angle, rather than a distance, for testing whether or not 
structures can be approximated. This angle is often referred 
to as the "opening angle" of the tree, and we will denote it 
hereafter as ^toi- 

To walk the tree to obtain the gravitational force on 
a given particle, the algorithm starts at the root node and 
opens it up, testing whether the daughter nodes subtend 



an angle of less than ^toi- If the angle is smaller than ^toi, 
the properties of the daughter nodes (mass, position, cen- 
tre of mass) are used to calculate their contribution to the 
force. As such, any substructure within the daughter nodes 
is ignored, and the mass inside in the nodes is assumed to be 
uniformly distributed within their boundaries. If one or more 
of these nodes subtends an angle larger than ^toi, the nodes 
are opened and the process is repeated on their daughter 
nodes, and so on, until nodes are found that appear smaller 
than ^toi- To increase the accuracy of the force calculation, 
the nodes often store multipole moments that account for 
the fact that the node is not a point mass, but rather a 
distributed object that subtends some finite angle (e.g. see 



Binney & Tremaine 1987). These moments are calculated 



during the tree construction, for all levels of the node hier- 
archy except the leaves, since these are either well approxi- 
mated as point masses - as is the case for a stellar A^-body 
calculation - or are SPH particles, which have their own 



prescription for how they are distributed in space (Bate et 
ar||T995t . 



The above method is sketched in Figure [T] which shows 
the tree structure in black, and the nodes, marked in red, 
that would be used to evaluate the gravitational force on 
the large blue particle with the orange highlight. In the 
cases where the nodes are leaves (containing only a single 
particle), the position of the particle itself is used. As the 
total number of force calculations can be substantially de- 
creased in comparison to the number required when using 
direct summation, tree-based gravity solvers offer a consider- 
able speed-up at the cost of a small diminution in accuracy. 
[Barnes Hut| ( |1989| ) showed that for a distribution of N 
self- gravitating particles, the computational cost of a tree- 
based solver scales as N log compared to the N'^ scaling 
associated with direct summation. They also showed that 
the multipole moments allowed quite large opening angles, 
with ^toi values as large as 0.5 radians resulting in errors of 
less than a percent. 

Our TreeCol method makes use of the fact that each 
node in the tree stores the necessary properties for construct- 
ing a column density map. The mass and size of the node 
can be used to calculate the column density of the node, and 
its position and apparent angular size allow us to determine 
the region on the sky that is covered by the node. Note also 
that column density, just like the total gravitational force, is 
a simple sum over the contributing material, meaning that 
it is independent of the order in which the contributions are 
gathered. Just as the tree allows us to construct a force for 
each particle, we can also sum up the column density con- 
tributions of the nodes to create a 47r steradian map of the 
column density during the tree-walk. 

A schematic diagram of how this works is shown in Fig- 
ure [2] The target particle - the one currently walking the 
tree, and for which the map is being created - is shown as 
the large dark blue particle on the left. Around it we show 
the spherical grid onto which the column densities are to 
be mapped. We see that the tree nodes, shown on the right, 
subtend some angle (which is less than some adopted ^toi), 
and cover different pixels on the spherical grid. During the 
tree walk, the TreeCol method simply maps the projection 
of the nodes onto the pixels for the particle being walked. 
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Figure 3. Schematic showing the overlap between a pixel on 
the SPH particle's HEALPix sphere, and the tree node. The an- 
gular size of the pixels and nodes are denoted by 2rp and 2rn, 
respectively, and the distances between their centres are given by 
the orthogonal angles Rq and i?^. The diagram shows the case 
when the angle subtended by the tree node is greater than that 
of the pixels, and the two possible situations that can arise: a) 
the pixel and tree node only partially overlap, and b) the pixel is 
entirely covered by the tree node. In the former case, we work out 
the mass in the overlapping area, and convert it to a column den- 
sity contribution by smearing it over the pixel's area. In the latter 
case, the pixel just obtains the full column density of the node. In 
the case where the angle subtended by the pixels is greater than 
the tree node (not shown here), then obviously the tree node can 
also become totally covered by the pixel. In this case, the full 
mass of the node is smeared out over the pixel's area to define 
the column density contribution. Full details of how the mapping 
is done in this implementation are given in Section |2.2| 
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Figure 4. Illustration showing how the nodes and pixels are 
assumed to interact in our implementation of the TreeCol algo- 
rithm. For each tree node, a new co-ordinate system is created, 
in which the node's position vector is the x-axis. The angular 
distance between the node centres, can then be described by two 
orthogonal angles, Rq and i?^, which allows us to define an over- 
lap area d(j)d6. Note that nodes and pixels have an are a (2 rn)^ 
and (2rp)'^ respectively. Full details are given in Section [2^ 



2.2 A simple implementation of TreeCol 

The details of exactly how the nodes are mapped onto the 
grid depends on how accurate one needs the column den- 
sity information to be. However, it should be noted that the 
tree structure is only an approximate representation of the 
underlying gas structure: it distributes the mass in a some- 
what larger volume than is actually the case, and as a result, 
sharp edges tend to be displaced to the boundary of node. 
As such, column densities from the tree will always be ap- 
proximate, and so a highly accurate mapping of the node 
column density projections is computationally wasteful. In 
what follows, we will describe a simple implementation of 
TreeCol that is both reasonably accurate while at the same 
time requiring minimal computational cost. 

Our mapping of the tree nodes to the pixels makes a 
number of assumptions regarding the shape and projection 
of the nodes and the pixels. These are: 

• The tree nodes are always seen as squares in the sky, 
regardless of their actual orientation. 

• The nodes are assumed to overlap the pixels as shown 
in Figure |4] such that we can define the overlapping region 
based on simple orthogonal co-ordinates in the plane of the 
sky. 

• We use the if^ALPz^Q algorithm (| G6rski et a l.|2005| to 
compute pixels that are equidistant on the sphere's surface 
and that have equal areas. 

We show a schematic diagram of the way the nodes are 
assumed to overlap in Figure [3] The tree nodes are taken 
to be squares with side length 2rn and likewise, the pixels 
onto which the column densities mapped are assumed to 
be squares with side length 2rp. As shown in the diagram, 
these dimensions are assumed to be equivalent to the angles 
subtended by the nodes and the pixels. Overlap requires that 



Re <rp + rn 
and 



(5) 



(6) 



If this is the case, the lengths, dO and dcj) describing the 
overlapping area are then given by 



= min{(rp + rn - Re), 2rp} 



and 



min{(rp + rn - Re/)), 2rp}, 



(7) 



(8) 



when the pixels have a smaller angular extent than the nodes 
(i.e. rp < rn), or 



min{(rp + rn - i^e), 2rn} 



and 



' = min{(rp + rn - i?^), 2rn}, 



(9) 



(10) 



when the nodes have a smaller angular extent than the pixels 
(i.e. rn < rp). By taking the minimum of the expression 
(rp + rn — Re,cf)) and either the node or pixel side length, 
we account for situations such as those shown in the right- 
hand panel in Figure |3] in which either the pixel is totally 
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covered by the node, or the node is totally contained within 
the pixel. We can then calculate the contribution of the node 
to the pixel's column density from 

(11) 



4r2 



This expression is formed by considering the mass in 
the overlapping area given by dOdcj). If the pixel is totally 
covered by the node, then it gets the full column density of 
the node. If the pixel is only partially covered by the node, 
then the mass in the overlapping region is smeared out over 
the area of the pixel, to create a new column density. If the 
node is totally contained within the pixel, then obviously all 
the mass from the node is smeared out over the pixel's area. 

Clearly, the ability of and (j) to describe the overlap- 
ping area breaks down near the poles, with the extreme case 
where a pixel directly over either of the poles cannot be 
described by a dcj). To account for this, we move to a co- 
ordinate system in which the tree node's position vector, n 
describes a new x-axis (x^), such that the node is always 
located at (1, 0, 0). To define the other two axis (the new y 
and z axis), we first define a control vector, a, that is close 
to the node's position vector, displaced by a small amount 
in and (j). For the displacement we choose (somewhat ar- 
bitrarily) that decreases by |rp for ^ < f and increases 
by |rp for > and that (j) always increases by |rp We 
then define the new axis vectors from: 



X = n 
b = x^ X a 
/ b 

z = X X y 



(12) 
(13) 
(14) 

(15) 



The pixel unit vectors p are then rotated into this co- 
ordinate system simply by taking the scalar product with 
each of the axes: 



p X 

p y' 

p z . 



(16) 
(17) 
(18) 



A schematic diagram of how the pixels and nodes are defined 
in this new coordinate system is given in Figure [4] The an- 
gular distances Re and can then be defined simply from: 



Rq — arcsinp^ 
and. 



Ra) — arccos 



(19) 



(20) 



To increase the speed of the algorithm, these inverse trigono- 
metric functions can be made into look-up tables. 

It should be noted that after this rotation, the pixels - 
and even the tree node itself - are in general not aligned as 
they appear in Figure [4] In the above coordinate transform, 
the control vector a determines how the new coordinate vec- 
tors y' and z' are orientated with respect to the new x axis, 
x (that is, how y and z are rotated around x'). In fact, 
it should also be stressed that the HEALPix pixels are not 
aligned as in Figure [4] 6e/ore the rotation, but in fact appear 



more diamond shaped (as one can see in the maps in Figures 
5-9). We found that the exact rotation is typically unim- 
portant for the mapping, provided the angular resolution in 
the map is not significantly smaller than the opening angle 
used during the tree- walk. We discuss this further in Section 

m 

In our discussion so far, we have referred only to 'nodes', 
and their properties, but it should be stressed that some of 
the nodes will be 'leaves'. In our implementation, the leaves 
are SPH particles, and as was mentioned above, it is custom- 
ary to use the particle properties directly when evaluating 
the gravitational forces (or in our case, the column density). 
As such, we adopt the exact particle position when consid- 
ering the leaf nodes. However, although SPH particles have 
non- uniform radial column density profiles, we do not take 
this into account in our TreeCol implementation, but rather 
treat the SPH particles in the same manner as the other 
nodes, by assuming that they have a uniform column den- 
sity, and project a square in the sky, rather than a circle. 
As our node to pixel mapping is based around mass conser- 
vation (the concept behind Equation 11), we cannot simply 
use the smoothing length h to define the square (so = h 
in Gadget 2, or 2h for the definition of h in most other 
SPH codes), but instead have to define rn to conserve area, 
giving, 

1 



(21) 



Our motivation for treating the SPH particles in this way, 
is that working out the true fraction of the SPH particle's 
mass that falls within the pixels is computationally expen- 
sive, requiring a numerical integration over the overlapping 
areas, since the pixel and the SPH particle have quite dif- 
ferent shapes. 

Our choice of the HEALPix method of pixelating the 
spheres around the SPH particles was motivated by two 
main factors. First, the pixels in the HEALPix mapping are 
equal area, which simplifies any comparisons of the pixel 
properties between different maps. Second, the equal-area 
property means that increasing the pixel number is equiva- 
lent to increasing the angular resolution of the map every- 
where on the sphere. This is not the case with the traditional 
latitude-longitude discretisation, for example, in which the 
pixels at the poles have a significantly smaller area than 
their counterparts at the equator. 

Finally, for our SPH code, we use the publicly available 
code Gadget 2 ( [Spr ingel 2005), which uses an oct-tree. For 
this project, where we need to have control over the opening 
angle ^toi to show how it affects the results, we adopt the 
standard Barnes and Hut opening criterion (Barnes & Hut 



1989) rather than the 'relative error' criterion suggested by 



Springel] ( |2005| . 



3 TESTS OF TREECOL 

In this section we apply the TreeCol algorithm to two very 
different types of test problem. In the first case, we consider 
a gas cloud that is an isolated sphere, with conditions sim- 



ilar to the dense cores found in the Pipe nebula (Alves et 
aL||2007 ). For the second test problem, we consider a cloud 
that is a model of a turbulent molecular cloud, which is 
representative of the environment in which prestellar cores 
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form. These two different set-ups are typical of tfiose used 
in contemporary simulations of star and cluster formation. 

In what follows, we will use Hammer projections to dis- 
play the 47r steradian maps of column densities seen from 
given locations within the cloud. There are four types of map 
that we will show. The first is the 'true' column density map 
obtained by summing up the contribution from every single 
SPH particle in the simulation. For this type of map, it is 
customary to take into account the radial density profile of 
the SPH particles, as described by their smoothing kernel. 
However since this is not done in the TreeCol implementa- 
tion we choose not to do this here for the SPH maps. Instead 
we assume that each SPH particle has a constant column 
density defined simply by its mass, and radial extent (that 
is, the particle's smoothing length). 

The second type of map is the 'pixel- averaged' map, 
whereby we pixelate the 'true' map into a set number of 
HEALPix pixels, by averaging over the points from the Ham- 
mer projection that lie inside each pixel. This type of map 
provides a more useful measure than the 'true' maps, as 
the results of TreeCol are also stored on HEALPix grids. 
As such, TreeCol could be said to be working perfectly if it 
can recover the same column densities as those shown in the 
'pixel- aver aged' maps. 

The third type of map is simply the column density map 
produced by TreeCol. Finally, our last type of map describes 
the error in the TreeCol method. We define a fractional error 
fi for each pixel i by 



Full SPH particle map 



48 Healpix pixels 



(22) 



where ^p. is the column density of pixel i in the 'pixel- 
averaged' map, and is the column density in pixel i re- 
covered by TreeCol. 

In our tests, we will also explore the two intrinsic res- 
olutions that are at play in our implementation of TreeCol. 
The first is the number of pixels in the HEALPix sphere that 
surrounds the SPH particles, which represents the ability of 
the SPH particle to record the column density information 
that comes from the tree walk. The second resolution at play 
is the opening angle, ^toi, as this determines how accurately 
the tree is forced to look at the structure in the cloud. To- 
gether these determine the accuracy and level of detail that 
is present in the TreeCol map. 

3.1 Spherical cloud 

In the first test problem, we consider a particle located at 
the edge of a spherical, isothermal cloud with a mean mass 
density of 3 x 10~^° gcm~^, a temperature of 10 K and a 
mass of 1.33 M©. The cloud is modelled with 10,000 SPH 
particles, and hence the mass resolution is comparable to 
that used in contemporary models of cluster formation (e.g. 



Bate et al.|2003||Clark Bonnell|20Q5l|Jappsen et al.|2005 ) 
The cloud is gravitationally unbound, but confined by an ex- 
ternal pressure of 10^ K cm~^ and has been allowed to set- 
tle into hydrostatic equilibrium. It centrally condenses into 



a stable Bonnor-Ebert sphere (Ebert 



1955 



Bonnor 

7TT 



1956 



Ebert||l957| ) with a central density of 3.4 x ICT^^cm 



and an outer density of 1.7 x 10~ gcm~ at a radius of 
0.09 pc. The column density map of the sky, as seen by 
the particle at the edge, is shown in Hammer projections in 




0.0001 0.0010 0.0100 
column density [g cm~^] 

Figure 5. The column density Hammer projections for a particle 
sitting on the edge of a centrally condensed sphere. The upper left 
panel shows the column density projection from all SPH particles 
in the simulation volume. The other panels then show the same 
47r steradian map pixelated into 48, 192, and 768 HEALPixpi:sie\s. 
The pixel values are simply averages of Hammer projection points 
that lie inside each pixel's boundary. 



48 pixels; 9^^^ = 0.3 



48 pixels; ^i^i = 0-5 




0.0001 0.0010 0.0100 
column density [g cm~^] 

Figure 6. The maps recovered by our TreeCol implementation, 
for the column density distribution shown in Figure [s] The maps 
are shown for two different measures of the resolution: the opening 
angle, ^tob of the tree (a measure of how well TreeCol can 'see' 
the cloud), and the number of pixels in the HEALPix map (a 
measure of how accurately TreeCoVs results are stored). 
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Figure |5] In this Fi gure, we show the 'true' column density 
map, as calculated from the individual SPH particles that 
make up the cloud, and also the map pixelated into 48, 192, 
and 768 HEALPix pixels to create the 'pixel- averaged' maps 
described above. On the left-hand side of the Hammer pro- 
jections one can see the high column density of the centrally 
condensed core of the sphere, and on the right-hand side of 
each map one can see the edge of the sphere, and the empty 
void beyond. 

Although such a simple cloud geometry may seem triv- 
ial, it actually represents a stringent test of the TreeCol al- 
gorithm. First, the tree itself is made up of a series of boxes, 
and so the intrinsic geometries of the cloud and tree are quite 
different. Second, we would expect that the rapidly evolving 
gradient in the column densities - associated with the sharp 
edge of the cloud - will be difficult for the tree to capture, 
as the edges of the nodes will tend to be in a different place, 
as discussed above. 

Despite these difficulties, the algorithm is able to cap- 
ture the main features of the cloud fairly well. Figure [6] shows 
the TreeCol representation of the sky maps given in Fig- 
ure [5] for two different tree opening angles, ^toi = 0.3 and 
^toi = 0.5. We can see that the column density towards the 
centre of the cloud is well represented, and that the maps 
have the same overall features as those in Figure [5] high col- 
umn density on one side, and a fairly sharp decline on the 
other side where the column density falls to zero. 

Although the images in Figure [6] give an idea of the 
structure and boundaries that TreeCol is able to repro- 
duce, it is difficult to gauge the quantitative accuracy of 
the method. A better representation is shown in Figure |7| 
where we plot the relative error in the TreeCol maps. Here 
we see that the error is typically less than 10 percent when 
the column density is high, but can be as large as around 
100 percent when the column density is low, or approach- 
ing zero. The high error (around 50 percent) in the middle 
of the map (and the outer extremities) comes from the fact 
that the boundaries of the tree nodes are not necessarily 
aligned with the edge of the particle distribution. As we in- 
crease TreeCoFs ability to see the structure in the cloud, 
by reducing the opening angle, we see that the error at the 
boundary decreases. Overall, the best representation of the 
cloud's boundary (and indeed the cloud itself) is found in 
the 48 pixel map that was run with a tree opening angle of 
0.3. This is unsurprising, as the low resolution of the pixel- 
averaged map is also unable to capture the sharp fall in the 
column density at the cloud's boundary, while at the same 
time the smaller opening angle ensures that the pixels on 
the boundary are not assigned mass that belongs to further 
inside the cloud. 

In general, we see that the smaller opening angles tend 
to produce better maps for a given pixellation. This is ex- 
pected, since as the opening angle is reduced, the properties 
of the tree nodes become closer to the actual distribution 
of the particles. This is most obviously apparent in the 768 
pixel map, where we see that the map obtained for ^toi = 0.5 
contains artefacts from the underlying boxy structure of the 
tree, while the ^toi = 0.3 map is much smoother. In the 
maps with a lower number of pixels, these features are not 
so apparent as the structure of tree is more smeared out. 

Perhaps a more useful measure of the ability of 
TreeCol to sample its surroundings is the error in the av- 
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Figure 7. The relative error (computed according to Equation 
|22| > based on the difference between the maps shown in Figure [g] 
and the pixellated maps shown in Figure [s] 



erage column density in the map, as given in Table [T] For 
the spherical cloud set-up, we find that the average column 
is between 4.2 and 7 percent higher than the average in the 
true map, with the lowest resolution run (^toi = 0.5, Npix 
= 48) having the largest overall error, and the highest reso- 
lution run (^toi = 0.3, Npix = 768) having the lowest error. 
The fact that these errors are so low reflects the fact that 
the mean is dominated by the high column density regions, 
which are recovered well by TreeCol in all the resolutions we 
study. 



3.2 Turbulent clouds 

Our previous test examined the ability of the TreeCol al- 
gorithm to capture the column density variations that one 
would expect in the environment of a prestellar core. How- 
ever the test was also designed to see how well TreeCol can 
handle sharp density contrasts, and so the core was sim- 
ply placed in a vacuum, rather than the more complicated 
environment of a turbulent molecular cloud, in which the 
typical prestellar core is born (Mac Low & Klessen 2004). 



This is the focus of the test in this section. Again, we want 
to make the test problem as challenging as possible, so in 
this section we choose to examine the sky-map as seen by 
a low density particle in the cloud. For such a particle, the 
holes and filaments that characterise the turbulent cloud's 
structure should be more pronounced than they would be for 
a high density particle, and so the contrast in the column 
density map is high. 

Our cloud has a mass of 10^ M©, an initial mean number 
density of 300 cm~^ (or mean mass density 1.17 x 10~^^ g 
cm""^), and a radius of roughly 6 pc. We model the cloud 
with 2 X 10^ SPH particles. At the start of the simulation. 
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Figure 8. The column density sky-map as seen by a low-density 
particle in a turbulent molecular cloud simulation. As in Figurejs] 
the upper-left panel is obtained by adding up the contributions 
from all SPH particles in the computational volume (excluding 
the particle from which the sky is viewed). The other panels then 
show a 'pixel-averaged' view of the cloud, as would be seen if we 
only had 48, 192 and 768 pixels in our map. 



Table 1. A summary of the mean column densities in the cloud 
models presented in Sections |3.1| and |3.2| for both the true map 
(the first line) and each of the TreeCol maps. For the TreeCol re- 
sults we give the number of pixels used in the column density map 
(Npix), the opening angle of the tree (^tol)? and the percentage 
error compared to the true map from the SPH particles. Note 
that due to the way the pixel- aver aged maps are obtained (see 
Section [sj, their average column density is identical to that in 
the full SPH map, and so we do not include it here. 
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Spherical cloud 






3.060 xlO-3 






48 


0.3 
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3.274 xlO-3 
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0.3 
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3.239 xlO-3 
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0.3 


3.192 xlO-3 


4.3 
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0.5 


3.226 xlO-3 


5.4 


Turbulent cloud 






1.151 xlO-2 






48 


0.3 


1.126 xlO-2 


2.2 




192 


0.3 


1.125 xlO-2 


2.3 




768 


0.3 


1.133 xlO-2 


1.6 



we impose a turbulent velocity field on the cloud that has 
a power spectrum of the form P{k) oc k~^, and adjust the 
strength of the velocities such that the kinetic energy in 
the cloud is equal to the gravitational energy of the cloud. 
This gives an initial root-mean-squared velocity of around 3 
kms~^. The turbulence is left to freely decay in shocks as it 
creates structure in the initially uniform density cloud. 

We stop the calculation after a period of 6.4 x 10^ yr, 
by which time a combination of the turbulence and self- 
gravity has created a network of interconnected filaments 
and voids. This structure can been seen in the sky-maps 
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Figure 9. The left-hand panels show the TreeCol maps for the 
turbulent cloud set-up shown in Figure [s] for 48, 192 and 768 
pixels in the map. All maps were produced using a tree opening 
angle ^tol = 0-3. The right-hand panels show the relative error in 
the TreeCol maps. 



shown in Figure [S] where, as discussed, we position ourselves 
on a low-density particle that resides near the centre of the 
cloud. As in Figure |5] we again show how this map would 
look if it were to be de-resolved to 48, 192, and 768 HEALPix 
pixels, giving us some idea how well TreeCol can be expected 
to perform. It is already obvious from the pixel- averaged 
maps that even at the 768 pixel level, many of the very dense 
features are going to be missing from the map. Nevertheless, 
the mean column densities in the coarse, pixellated maps 
are all within 0.1 percent of the mean column in the full 
SPH map, and so they are still a good representation of the 
column density distribution in the cloud, even if they are 
unable to resolve the small-scale detail. 

The images in Figure [9] show results from TreeCol for 
this cloud, including the TreeCol column density maps and 
their associated relative errors. Given the amount of struc- 
ture in the cloud, we construct the maps in this figure while 
keeping the tree-opening angle fixed at 0.3. Overall we see 
that the algorithm behaves well, and the features present in 
the pixel- averaged maps are recovered, even at our highest 
mapping-resolution of 768 pixels. For the 2 lower-resolution 
maps (48, and 192 pixels), the errors in the maps are mainly 
small, and TreeCol typically recovers the column densities 
to around 5 percent. However, we see that the errors in the 
768 pixel map are again quite high, and for the same reasons 
as we seen in the previous test, namely that the pixellation 
of the map is too high for the adopted tree-opening angle, 
and so the structure of the tree is beginning to show in the 
map. 

Although the cloud studied here is more complicated 
than that studied in Section [3. 1| the errors in the mean col- 
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Figure 10. The dust temperature profiles for two uniform den- 
sity clouds (10~-'^^gcm~'^) of mass 1 and 10 M©, heated by the 
Black (1994) interstellar radiation field. Orange points show the 
output from the RADMC-3D Monte Carlo radiative transfer code 
- run with 80^ grid cells and 2 x 10^ photon packets - and the 
blue points denote the output from an SPH simulation that uses 
the column density information recovered by TreeCol in conjunc- 
tion with the method for calculating dust temperatures given in 
Goldsmith (2001). In the SPH simulation we use 261932 particles 
and a tree-opening angle of 0.5. The dust opacities are a combina- 
tion of Ossenkopf &; Henning ( 1994) (non-coagulated and thick ice 
mantle grains) for wavelengths longer than 1 /im, and those given 
in |Mathis, Mezger &: Panagia| ( |1983| for shorter wavelengths. 



umn density (given in Table [T]) are actually lower than they 
are in the spherical cloud, and range from 1.6 to 2.3 per- 
cent. Unfortunately, the extra small-scale structure means 
that the way in which the error relates to the number of 
pixels is not as consistent here as it was for the previous 
cloud set-up. As one moves to higher number of pixels, the 
small-scale, high-column features start to become resolved, 
but although TreeCol is able to see them, it tends to get 
their location wrong as the tree node that represents these 
regions occupies a slightly different part of the sky, and has 
a different angular extent, than the true SPH particle dis- 
tribution. So while the mass located in these small scale 
features is captured by the map, the location and spread is 
not, and as a result, one pixel may get too much column, 
while a neighbour receives too little. This is why the pixel 
error (remember that this is the absolute error) in Figure [9] 
starts to get worse as the number of pixels increases. 



3.3 Dust heated by the interstellar radiation field 

Although we have seen that TreeCol can typically deliver 
a fairly accurate column density map of the sky, there are 
situations in which the errors in the map can be as much 
as 100 percent, and so it is prudent to check whether this 
is a problem when one applies TreeCol to a real astrophysi- 



cal calculation. To this end, we look at a typical problem in 
star formation: the temperature profile of prestellar cores, 
heated by the interstellar radiation field (ISRF). This prob- 
lem has been looked at by number of authors, with aims 
ranging from deriving observational masses from dust emis- 
sion (see e.g. Stamatellos et al.|2007 ), to understanding the 



dynamics and fate of prestellar cores (Keto Caselli[2008 



2010). Furthermore, Larson (2005) suggested that the dust 



temperature at the onset of thermal coupling between the 
gas and dust could help set a characteristic Jeans mass in 
molecular clouds, which may be responsible for setting the 
characteristic mass for star formation (see also [Jappsen et 
|al.||2QQ5| ). Since calculating the dust temperatures reduces 
to evaluating the attenuation of the ISRF - which depends 
on the column density between the core and the incoming 
radiation - it is ideally suited to the TreeCol approach. We 
describe here a simple method by which this can be included 
into our TreeCol implementation. 

In the case of a static cloud in thermal equilibrium, one 
can solve for the dust temperature Td by finding the value 
that satisfies the equation of thermal balance for the dust|^ 



■Ad 



0. 



(23) 



Here Text is the dust heating rate per unit volume due to 
absorption of radiation from the ISRF and Adust is the ra- 
diative cooling rate of the dust. 

Following [Goldsmith (2001), one can express Text as 
the product of a optically thin heating rate, rext,o, and a 
dimensionless factor, x, that represents the attenuation of 
the ISRF by dust absorption: 



Text — X^extjO- 

The optically thin heating rate is given by 



rext,0 = 4:7vVp 



f 

Jo 



(24) 



(25) 



where V is the dust-to-gas ratio, p is the gas density, Ju is 
the mean specific intensity of the incident ISRF, and is 
the dust opacity in units of cm^ g~^. Provided the proper- 
ties of the radiation field and the dust do not change over 
the region in question, the integral in the above expression 
needs to be computed only at the start of the simulations, 
and then simply used as a pre- factor. In our application, we 



adopt the ISRF given in Black 



ities of Ossenkopf & Henning (1994) (non-coagulated and 



(1994), and the dust opac- 



thick ice mantle grains) for wavelengths longer than 1 /xm, 
and those from Mathis, Mezger & Panagia ( 1983 ) at shorter 
wavelengths. 

The attenuation factor x, is then given by the equation 



xiNu) = 



4:7V Jvf^v exp (— A^iyS) dzy 
47r Jui^u diy 



(26) 



where S = 1.4mpA^H, TUp is the proton mass, and A^h is the 
number density of hydrogen nuclei. This equation can also 

^ Note that we assume here for simplicity that energy transfer 
from the gas to the dust (or vice versa) does not significantly 
affect the dust temperature. This assumption is valid whenever 
that the gas and dust temperatures are equal or the gas density 
is low enough that the coupling between dust and gas is weak. 
For a more detailed treatment, see e.g. [Glover Clark| ( |20lTa| . 
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be solved for a wide range of different values of A^h, and 
results can then be stored in a look-up table to be called 
during the simulation. 

For the dust cooling rate Adust, one has to solve, 



Adust (Td) = AnVp / B,{Td)K, diy 



f 

Jo 



(27) 



where Bu{Td) is the Planck function for a temperature Td. 
For our choice of dust opacities, we find that the resulting 
cooling rate is well fit by the function 



Adust (Td) = 4.68 X 10"^'T|nergs"' cm" 



(28) 



for dust temperatures 5 < Td < 100 K (Glover & Clark 



2011a). 



Using this framework, it is then straightforward to solve 
for the dust temperature of each SPH particle. First, a value 
of X can be calculated for each TreeCol pixel, based on its 
column density. Provided the pixels have an equal area (as 
is the case with the HEALPix scheme employed here), the 
arithmetic mean of these x values can then be used in Equa- 
tion [24] to compute the value of Text for the particle. This 
is then used in Equation [23] along with the cooling rate in 
Equation |28| to compute the dust temperature associated 
with the SPH particle. 

An example of this technique, applied to two clouds, 
is shown in Figure |10| Given that the errors in the spheri- 
cal cloud test in Section [3.1 [w ere higher than those for the 
turbulent cloud in Sectiori ]3T2l we again choose a spherical, 
isolated cloud as our test bed, so that we can check whether 
these errors are important in a typical astrophysical set-up. 
The clouds both have a uniform density of 10~^^gcm~^, 
but differ in mass, with one having a mass of 1 M©, and the 
other 10 M©. Figure [Tq| shows the resulting radial temper- 
ature profiles for the clouds as computed using TreeCol in 
conjunction with the method described above, for 261932 
SPH particles. This number of particles is comparable to the 
number used in simulations of the collapse of prestellar cores 
( Bate|1998| [Clark Bonnell|2005||Bate|2010 ). We adopt an 
opening angle of 0.5 in this test, and we use 48 pixels in 
the column density map. For comparison we also show the 
results from RADMC-3E[^ a Monte Carlo radiative transfer 
code, performed using 20 million photon packets on a 80^ 
uniform grid. This number of grid cells is chosen to ensure 
that the number of cells inside the sphere is the roughly the 
same as the number of SPH particles. Note that scattering is 
switched off in the RADMC-3D run, and neither is it taken 
into account in the x factor used in the TreeCol implemen- 
tation. Finally, for V we take the standard value for solar 
metallicity gas. 

Comparing the results from the TreeCol method to 
those from RADMC-3D, we see that the former recovers the 
general properties of the dust temperature profile very well. 
The temperatures are higher on the outskirts of the cloud, 
where the gas is exposed to more radiation, and cooler in the 
centre, where the gas is better shielded. Also, the lower mass 
(and therefore overall lower mean column density) cloud is 
hotter than the higher mass cloud, as every part of it sees 
more of the ambient radiation field. 



^ http:/ /www. ita.uni-heidelberg.de/~dullemond/software/radmc- 
3d/ 



Most importantly however, we see that the results from 
our Tree Co/-based dust temperature calculation lie within 
the scatter of the Monte Carlo code, which itself is only 
around 0.5K. However note that while the error in the 
RADMC-3D results will get better as more photon pack- 
ets are used to model the radiation field, we find that the 
TreeCol method does not get significantly more accurate as 
the either the opening angle is decreased, or the resolution in 
the pixel map is increased. The source of the discrepancy is 
that RADMC-3D is able to treat the absorption, re-emission 
and then re-absorption of the incoming photons, while the 
TreeCol method only models the initial absorption. In other 
words, what we doing in the TreeCol implementation is only 
an approximation to the full radiative transfer problem. As 
such, the main source of error in the TreeCol results shown 
in Figure [Tq| is not the error in the column density maps ob- 
tained during the tree-walk, but the approximations made 
in using these column densities to calculate the dust tem- 
perature. 



4 POTENTIAL APPLICATIONS 

There are a number of different areas of computational as- 
trophysics in which we expect the TreeCol algorithm to be 
useful. One important example is the case examined at in 
the previous section: modelling the penetration of the in- 
terstellar radiation field into a prestellar core, and the re- 
sultant heating of the dust. At high gas densities, the gas 
and dust temperatures within prestellar cores are closely 
coupled, and the dust plays a crucial role in regulating the 
thermal behaviour of the gas. An accurate determination of 
how the equilibrium dust temperature changes during the 
dynamical evolution of a prestellar core is therefore of great 
importance in studies of the stability of such cores (see e.g. 
Keto k, CasellipOlO ) and so a method to compute this ef- 



ficiently within a high-resolution three-dimensional simula- 
tion is clearly of great utility. 

Another example of the kind of problem for which we 
expect TreeCol to be a valuable approach is the attenuation 
of ultraviolet radiation within simulated molecular clouds. 
UV radiation plays a central role in regulating much of the 



astrochemistry within molecular clouds (see e.g. Hollenbach 



[fc Tielens^ l999), and yet current numerical models of cloud 
formation typically use very simplistic methods to model the 
radiation field. For instance, the studies by |Dobbs, Bonnell] 
k Pringlel (2006) and iDobbs et al.l ([20081 use an approxi- 



mation in which the absorbing column at any point in the 
simulated interstellar medium is estimated by multiplying 
the local gas density by a fixed shielding length. |Gnedin,| 
Tassis, & Kravtsov (2009) use a similar local approxima- 



tion, but rather than keeping the shielding length fixed, 
determine it by examining the local density and velocity 



gradients (see also Gnedin & Kravtsov 2011). Finally, the 
detailed models presented in Glover et al. (2010) use a "six- 



ray" approach in which the column density of gas between 
each cell and the boundaries of the simulation is calculated 
along six lines of sight, taken to run parallel to the coordi- 
nate axes. TreeCol represents a significant improvement in 
accuracy over all of these approaches, and we have already 
begun to make use of it in our work on star formation within 
molecular clouds ( [Glover k, Clark||2011b| . 
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A further example of a possible application for our 
method is modelling of the X-ray heating of the intergalac- 
tic medium (IGM) prior to the epoch of reionization. At 
high redshifts, X-rays produced by sources such as massive 
X-ray binaries (e.g. Glover & Brand 2003| Mirabel et al. 
2011) or mini-quasars ( Zaroubi et al.|2007 ) heat the neutral 
IGM, producing distinctive signatures in the 21-cm back- 
ground (see e.g. Prit chard Furlanetto|2007 ). As the heat- 
ing is dominated by soft X-rays, with relative short mean 
free paths, the resulting temperature distribution of the gas 
is inhomogeneous. Accurate modelling of the temperature 
distribution is necessary if we are to make optimal use of the 
information provided by the 21-cm background, and this in 
turn requires us to compute the column density distribution 
around numerous sources in a computationally efficient man- 
ner, an ideal application for an approach such as TreeCol. 

Of course, we anticipate that TreeCol will be useful in 
areas besides those listed here, but hope that this brief dis- 
cussion gives a general idea of the situations in which an 
efficient method for estimating column densities within nu- 
merical simulations is likely to be useful. 



5 A NOTE ON PERFORMANCE 

The fact that TreeCol makes use of the pre-existing gravi- 
tational tree has several advantages when it comes to the 
performance and implementation of the algorithm. First, 
it means that the algorithm will scale with increasing par- 
ticle number in the same way as the tree scales. As dis- 
cussed above, this is typically N log N for most tree codes 
(note however that some techniques may make it possible 
to achieve even better scaling, for example the features dis- 
cussed in Gafton & Rosswog 2011). In contrast, ray-tracing 



Table 2. Parallel efficiency of the test calculation described in 
the text, performed both with and without TreeCol, as a function 
of the number of processors, A^p. The efficiency is normalized to 
unity for A^p = 64. 



- the method most commonly used for obtaining column 
density estimates - typically scales as N^^^ . Further, the im- 
plementation of TreeCol is relatively easy, as the three quan- 
tities that are required to capture a node's column density 
contribution (namely the mass, relative position and angular 
size) are already used in the calculation of the gravitational 
forces. As such, implementing TreeCol requires few (if any) 
structural changes to the underlying tree code, and mainly 
reduces to adding a call to a function that handles the map- 
ping of the node's contribution to the target particle's col- 
umn density map. As such, TreeCol can also be implemented 
easily in grid-based fluid codes that use a tree-scheme to cal- 
culate gravitational forces. For example, the method has re- 
cently been implemented into FLASH (Afiorve, private com- 
munication) . 

Another useful feature of such a tree-based algorithm is 
that it is naturally adaptive: every particle in the cloud will 
see its immediate surroundings at a set angular resolution, 
regardless of the physical size of the density structure at the 
local scale of the particle. As an example, one can take the 
case of a protostellar disc sitting in a collapsing protostellar 
core. In TreeCol^ the particles on the edge of the disc will 
be able to pick up the local drop in column density arising 
from the disc-envelope boundary, while those on the edge 
of the core will pick up the boundary between the core and 
the ambient cloud. This is a natural consequence of the way 
that a gravitational tree solver breaks up the cloud into a 
hierarchically nested grid. 





With treecol 


Without treecol 


64 


1.00 


1.00 


128 


0.61 


0.61 


256 


0.42 


0.39 


512 


0.22 


0.28 



Also, it should be noted that tree codes tend to paral- 
lelize well, and parallel tree gravity is now a standard feature 
of contemporary SPH codes. As such, TreeCol can naturally 
take advantage of the speed-up that parallelization offers. 
In order to verify that the additional work that must be 
done during the tree- walk need not adversely affect the par- 
allel scaling of the code, we have studied the scaling of a 
representative test problem both with and without the use 
of TreeCol. For our test, we modelled the evolution of an 
isothermal, turbulent molecular cloud using the Gadget 2 
SPH code, with two million SPH particles. We investigated 
how the wallclock time required to model the cloud for a 
specified period varied as we increased A^p, the number of 
processors used to run the code. For each value of A^p, we 
performed two simulations: one in which TreeCol was used 
to compute the column densities, and one in which it was 
not. We defined a parallel efficiency for each simulation as 

where T is the elapsed wallclock time for the simulation, and 
T64 is the wallclock time for the A^p = 64 simulation. (Note 
that by this definition, rf — 1 for the A^p = 64 runs). We 
performed simulations with A^p = 64, 128, 256 and 512. The 
results are summarized in Table [21 

Although the parallel efficiency of Gadget 2 for this 
problem is not particularly high to begin with, it is clear 
that the use of TreeCol does not have very much influence 
on the scaling, suggesting that the additional communica- 
tions overhead is not a significant problem in comparison to 
the inherent difficulties involved in properly load-balancing 
a simulation of this type. 

However, as with any computational method, there are 
drawbacks to our approach. One of the main downsides 
of the TreeCol method is that it can introduce a signifi- 
cant memory overhead. The exact memory requirements of 
TreeCol can vary considerably, depending on the type of tree 
employed by the code, how the column density information 
is being used, and whether the code is parallelized using 
the Message Passing Interface (MPI) protocol, or using the 
OpenMP protocol. In Gadget 2 for example - a code that 
is MPI parallelized - copies of the SPH particles on a given 
CPU are sent to all the other CPUs to get the contributions 
to the gravitational force from the particles that reside there. 
An implementation in Gadget 2 must then store two copies 
of the column density map for each particle: one that is 
broadcast to the other CPUs to pick up their contributions, 
and one that resides on the home CPU that collects the lo- 
cal contributions and stores the final total. Other tree codes 
parallelized using MPI work differently, sending the neces- 
sary information from the other CPUs to the CPU with the 
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target particle, requiring that only one map be stored per 
particle. In fact, if the column density map is only needed 
once - for example, to compute a mean extinction - then 
only the particle currently walking the tree needs a column 
density map. In this case the information stored in the map 
can be used at the end of particle's walk, and the map can 
then be cleared in preparation to be re-used in the next 
particle's tree-walk. So depending on the application, and 
on the code, the memory requirements for TreeCol can be 
anywhere from one map per parallel task, to two maps per 
particle. 
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